Automated Detection of Martian Dune Fields Using a Convolutional Neural Network

A workflow by Cole Speed and Yiran Shen, University of Texas at Austin

Aeolian sand dunes are prevalent landforms on the surface of Mars. While attempts have been made to map their distribution in satellite images, it is likely that many remain unmapped. In this project, we seek to develop a method to automatically identify dune fields on the surface of Mars using deep learning. Specifically, we take an image segmentation approach with a convolutional neural network (UNet) applied to a collection of orbital images of the Martian surface.

Left. Image of the Red Planet (Credit: NASA). Right. False-color image shows 'sea' of dunes near Mars' northern polar cap (Credit: NASA).

Left. Image captured by the Mars Reconnaissance Orbiter showing barchan dunes (Credit: NASA/JPL/University of Arizona/USGS). Right. The first dunes studied up close, known as "High Dune", imaged by NASA's Curiosity rover (Credit: NASA/JPL-Caltech).

Background and Significance of Dune Fields on Mars

Formation of Dunes on Mars and Variety of Dunes on the Martian Surface

Deposits on the surface of Mars provide the only available record of its temporal evolution. Many relict features remain well preserved on the Martian surface, documenting periods of actively flowing water and past Martian wind regimes. One such landform type is the aeolian dune, which forms and evolves due to wind-driven sediment transport. A variety of dune types common on Earth (barchan, linear, crescentic) are also observed on the Martian surface. However, some dune geometries exist on Mars that are not found on Earth. The distribution, configuration, and types of dunes on Mars provide a history of changes on the Red Planet. NASA has even developed a way of monitoring dune movement through time across the Martian surface!

Goal: Dune Field Detection and Classification

We would like to develop an automated approach for detecting and classifying Martian dune fields through the application of deep learning and image segmentation. We will do so using a convolutional neural network trained on hand-mapped dune field locations and high-resolution images captured by the THEMIS IR camera aboard NASA's Mars Odyssey orbiter.

Available Datasets

Imagery

Orbital missions provide a rich dataset of visible imagery and thermal data from the Martian surface. The Thermal Emission Imaging System (THEMIS) IR camera aboard the 2001 Mars Odyssey orbiter provides the data used in this study. A global mosaic of THEMIS imagery sampled at 100 m per pixel is the primary dataset for model training, validation, and testing.

Left. Mars Odyssey THEMIS camera imagery acquisition (Credit: NASA). Right. The global THEMIS mosaic has a spatial resolution of 100 m per pixel (Credit: NASA).

Hand-mapped Dune Polygons

Previous workers have mapped the locations of dune fields as polygons based on visual inspection of orbital imagery. These mapped Martian dune fields have been compiled into the Mars Global Digital Dune Database: MC2–MC29. However, many dune fields on the Martian surface remain unmapped.

Previous work: Deep Learning Applied to Planetary Surface Processes

Within the study of planetary surfaces and surface processes, deep learning, and specifically convolutional neural networks (CNNs), has been applied primarily to automatically identify and measure craters on both the Moon and Mars (Lee, 2019; Silbert et al., 2019). However, automatic detection and classification of dune fields using CNNs has not yet been achieved.

We decided to use the UNet architecture after seeing its success in identifying and mapping craters on the Moon and Mars (the DeepMoon and DeepMars GitHub repositories; Lee, 2019; Silbert et al., 2019), where it successfully determined the positions and sizes of craters from lunar and Martian digital elevation maps.

Their UNet convolutional neural network (and, by design, ours) is shown in the schematic below.

In our project we implement a custom version of the UNet architecture (Ronneberger et al., 2015). This architecture consists of a contracting path (left side) and an expansive path (right side), joined through multi-level skip connections (middle). Martian THEMIS images are input to the contracting path, and predictions are output from a final layer following the expansive path.

In the above schematic, boxes represent cross-sections of square feature maps. Each map's dimensions are indicated at its lower left, and its number of channels is indicated above it. In this diagram, the leftmost map is a 256 × 256 grayscale image sampled from the digital elevation map of the Moon, and the rightmost is the CNN's binary ring-mask prediction. We employ the same approach to highlight regions that correspond to dune fields in the THEMIS imagery.

Exploring the Data and Data Preprocessing

Prior to model construction, we visualize the data using mapping software (QGIS). We see that the dunes are strikingly different from the landscape surrounding them. We also notice that many dune fields are located within craters.

Types of dunes seen in the THEMIS Imagery

Dune fields are clearly imaged by the THEMIS IR camera. Because the imagery is relatively high resolution (100 m per pixel), individual dunes are clearly distinguishable.

However, we observe wide variability in dune field geometry and prevalence. We will have to see whether the CNN can handle this variability!

Training data: Satellite imagery from the THEMIS global mosaic

Training images were derived from the THEMIS Day IR 100m Global Mosaic, which provides imagery across the entire Martian surface at 100 meters per pixel. The full imagery dataset is ~22 GB.

Label data

Hand-mapped dune fields constitute the label data for this study. These data were obtained from the Mars Global Digital Dune Database, in which 550 dune fields have been mapped by hand based on visual interpretation of satellite imagery.

Data preprocessing

A significant amount of time on this project went into data collection, cleaning, and preparation.

We first reprojected the label shapefiles in QGIS to ensure a consistent coordinate reference system between the labels and images; a Mars Simple Cylindrical projection was used.

A binary raster was generated from the label feature class, in which pixels inside dune field polygons have a value of 1 while all other pixels are null.
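The actual rasterization was done with GIS tooling, but the underlying idea is a scanline even-odd (ray-casting) fill. As a minimal illustration, here is a pure-NumPy sketch (the `rasterize_polygon` helper is hypothetical; null pixels are represented as 0 rather than NoData):

```python
import numpy as np

def rasterize_polygon(poly, height, width):
    """Burn a polygon (list of (row, col) vertices) into a binary mask
    using the even-odd rule, evaluated at pixel centers."""
    mask = np.zeros((height, width), dtype=np.uint8)
    xs = np.arange(width) + 0.5          # pixel-center column coordinates
    n = len(poly)
    for i in range(height):
        y = i + 0.5                      # pixel-center row coordinate
        inside = np.zeros(width, dtype=bool)
        for j in range(n):
            (y1, x1), (y2, x2) = poly[j], poly[(j + 1) % n]
            # Does this edge cross the horizontal scanline at row y?
            if (y1 <= y) != (y2 <= y):
                x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
                inside ^= xs < x_cross   # toggle pixels left of the crossing
        mask[i] = inside
    return mask
```

In practice this is what `gdal_rasterize` (or the QGIS "Rasterize" tool) does far more efficiently against the projected shapefile.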

Because of the file size of the original THEMIS mosaic (~20 GB), we first cropped the satellite image and label rasters to a region that included the majority of the mapped dune field locations (see above).

We then generated image and label 'tiles' with dimensions of 256x256 pixels using a series of Python scripts in combination with the open-source GNU Image Manipulation Program (GIMP). These scripts can be found on GitHub.

This workflow resulted in 78,337 image and label tiles. For training purposes, we filtered the tiles to include only images containing at least one (or a portion of one) dune field. This resulted in a total dataset of 1,116 image/label pairs.
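The tiling-and-filtering step can be sketched in a few lines of NumPy (the `make_tiles` helper is hypothetical; our actual pipeline used scripts plus GIMP):

```python
import numpy as np

def make_tiles(image, label, tile=256):
    """Split an image/label pair into non-overlapping tile x tile chips,
    keeping only chips whose label contains at least one dune pixel."""
    h, w = image.shape
    pairs = []
    for r in range(0, h - tile + 1, tile):
        for c in range(0, w - tile + 1, tile):
            lab = label[r:r + tile, c:c + tile]
            if lab.any():  # filter: keep only tiles touching a dune field
                pairs.append((image[r:r + tile, c:c + tile], lab))
    return pairs
```

Applied to the cropped mosaic and binary label raster, this is what reduces ~78,000 raw tiles to the 1,116 labeled pairs used for training.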

Importing and visualizing the data

The input data file mars_dunes.npz (about 1 GB) is available at https://drive.google.com/file/d/1mBn8mQ1O51EprTpXSNQtK-VSmF3wD7Z-/view?usp=sharing

Using Google Drive, you can create a shortcut for a shared file:

  1. Right click the file.
  2. Click Add shortcut to Drive.

There are 1,116 two-dimensional training images with dimensions $256 \times 256$. We have only included image tiles that have labels, which excludes a large portion of the image tiles. Below, we visualize a random sampling of image tiles to better understand our training dataset.

We can also look at the corresponding labels that will be used in the training and validation of our neural network. We visualize the labels corresponding to the same random sampling of dune fields below.

Notice that most image/label tiles only capture a portion of a dune field (some only a very small portion). Below we visualize the image data and labels together.

Preparing data for training

To prepare the data for training, we will first normalize each image by subtracting the mean value and dividing by the standard deviation, resulting in each image having a mean of 0 and a standard deviation of 1. A fourth dimension of 1 is added to indicate that the input dataset has one channel.
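The per-image standardization and added channel dimension can be written compactly (the `normalize` helper is hypothetical but mirrors the step described above):

```python
import numpy as np

def normalize(images):
    """Per-image standardization: zero mean, unit std, plus a channel axis."""
    images = images.astype(np.float32)
    mean = images.mean(axis=(1, 2), keepdims=True)
    std = images.std(axis=(1, 2), keepdims=True)
    out = (images - mean) / (std + 1e-8)   # epsilon guards flat tiles
    return out[..., np.newaxis]            # (N, H, W) -> (N, H, W, 1)
```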

We randomly split the data for training (80%) and validation (20%).
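A simple way to perform this split, sketched here with a hypothetical `train_val_split` helper (a fixed seed keeps the split reproducible):

```python
import numpy as np

def train_val_split(images, labels, val_frac=0.2, seed=0):
    """Shuffle indices once, then carve off val_frac for validation."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(images))
    n_val = int(len(images) * val_frac)
    val_idx, train_idx = idx[:n_val], idx[n_val:]
    return (images[train_idx], labels[train_idx],
            images[val_idx], labels[val_idx])
```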

Designing the convolutional neural network (UNet)

We are going to use a convolutional neural network (CNN) accelerated on GPUs.

We used Google Colab for this project, taking advantage of its GPU to speed up training of our CNN.

Now we can design our UNet-based neural network.

We take a look at the model construction below.

Our model has 1,968,225 total parameters.
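Our actual model is built with a deep learning framework; as a library-free illustration of the UNet shape bookkeeping (pooling on the contracting path, upsampling plus skip concatenation on the expansive path), here is a toy NumPy sketch. To keep it short, random 1x1 channel mixing stands in for the real 3x3 convolution blocks, so only the shapes, not the learned features, are meaningful:

```python
import numpy as np

rng = np.random.default_rng(0)

def conv1x1(x, c_out):
    """Stand-in for a conv block: random 1x1 channel mixing + ReLU."""
    w = rng.standard_normal((x.shape[-1], c_out))
    return np.maximum(x @ w, 0)

def pool(x):
    """2x2 max pooling (contracting path)."""
    h, w, c = x.shape
    return x.reshape(h // 2, 2, w // 2, 2, c).max(axis=(1, 3))

def upsample(x):
    """2x nearest-neighbor upsampling (expansive path)."""
    return x.repeat(2, axis=0).repeat(2, axis=1)

def tiny_unet(x):
    e1 = conv1x1(x, 16)             # H   x W   x 16
    e2 = conv1x1(pool(e1), 32)      # H/2 x W/2 x 32
    b  = conv1x1(pool(e2), 64)      # H/4 x W/4 x 64 (bottleneck)
    # Skip connections: concatenate encoder maps onto upsampled decoder maps
    d2 = conv1x1(np.concatenate([upsample(b), e2], axis=-1), 32)
    d1 = conv1x1(np.concatenate([upsample(d2), e1], axis=-1), 16)
    return conv1x1(d1, 1)           # per-pixel dune/background score
```

The real network stacks more levels and learned 3x3 convolutions, which is how it reaches 1,968,225 parameters; the skip-concatenation pattern is the same.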

Model training

Now we are ready to train our model. As before, we use the original dataset (1,116 image tiles), cross-entropy as the loss function (applied pixel by pixel in the image), and Adam as the optimization algorithm.
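Pixel-by-pixel cross-entropy for a binary dune/background mask can be written explicitly (the `pixel_bce` helper is a hypothetical sketch of what the framework computes for us):

```python
import numpy as np

def pixel_bce(probs, labels, eps=1e-7):
    """Binary cross-entropy applied pixel by pixel, averaged over the image.
    probs: predicted dune probabilities; labels: binary dune mask."""
    p = np.clip(probs, eps, 1 - eps)  # avoid log(0)
    return float(np.mean(-(labels * np.log(p) + (1 - labels) * np.log(1 - p))))
```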

Data augmentation: A larger training dataset and (hopefully) improved training

More data is always (usually?) better. We now extend the dataset to four times its original size by flipping and rotating the images and labels.
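The fourfold augmentation can be sketched as follows (the `augment4x` helper is hypothetical; the key point is that the same transform is applied to image and label so the masks stay aligned):

```python
import numpy as np

def augment4x(images, labels):
    """Quadruple the dataset: identity, horizontal flip, vertical flip,
    and 180-degree rotation, applied identically to image and label.
    Arrays have shape (N, H, W)."""
    transforms = [
        lambda a: a,
        lambda a: np.flip(a, axis=2),         # left-right flip
        lambda a: np.flip(a, axis=1),         # up-down flip
        lambda a: np.rot90(a, 2, axes=(1, 2)) # 180-degree rotation
    ]
    ims = np.concatenate([t(images) for t in transforms])
    labs = np.concatenate([t(labels) for t in transforms])
    return ims, labs
```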

Balanced cross-entropy as an alternative loss function

Because background pixels vastly outnumber dune pixels in any given image, a network trained with an unweighted loss tends to be biased toward predicting the background class, which degrades performance on the dune class. This class imbalance can be mitigated by re-weighting the loss in favor of the rarer class, or by adding more instances of the less dominant class to the training data.
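The re-weighting idea can be sketched as a class-balanced version of the pixelwise loss, where weights are inversely proportional to class frequency (the `weighted_bce` helper and the 0.5 normalization are illustrative assumptions, not the exact formulation used in training):

```python
import numpy as np

def weighted_bce(probs, labels, eps=1e-7):
    """Pixelwise binary cross-entropy with class weights inversely
    proportional to class frequency, so sparse dune pixels are not
    drowned out by the abundant background."""
    p = np.clip(probs, eps, 1 - eps)
    frac_dune = labels.mean()                 # fraction of dune pixels
    w_dune = 0.5 / max(frac_dune, eps)        # up-weight rare dune pixels
    w_bg = 0.5 / max(1 - frac_dune, eps)
    loss = -(w_dune * labels * np.log(p) + w_bg * (1 - labels) * np.log(1 - p))
    return float(loss.mean())
```

With these weights, a model that predicts "background everywhere" is penalized heavily on the few dune pixels instead of scoring a deceptively low loss.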

Save the model

We save our model as "deepdunes_model.h5" so that it can be reloaded from Google Drive for further testing.

Checking the results compared to the hand-mapped dune field locations

Testing on full image dataset to identify additional dune field candidates

We would like to know if there are any unmapped dune fields on Mars. To find potential candidates, we apply our model to the full unlabeled dataset of ~76,000 256 x 256 image tiles. Because of RAM and disk limitations, we divided these into 38 files containing 2,000 tiles each. A shortcut to the image files can be found here. The model 'deepdunes_model.h5' can also be downloaded (or a shortcut can be made) here.
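The batching step is straightforward; a sketch with a hypothetical `chunk_tiles` helper (each batch would then be written out, e.g. with `np.savez_compressed`):

```python
import numpy as np

def chunk_tiles(tiles, per_file=2000):
    """Split a large tile stack into per_file-sized batches so each
    batch fits comfortably in Colab RAM and on disk."""
    return [tiles[i:i + per_file] for i in range(0, len(tiles), per_file)]
```

With ~76,000 tiles and 2,000 tiles per batch, this yields the 38 files used for inference.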

We now use our model to 'detect' pixels that may correspond to unmapped dune fields. This is only a preliminary result and needs more work!

Future work

Our training dataset included 454 hand-mapped dune fields. Using our model, we identify additional regions over the same area that may correspond to dune fields. Examples of dune fields identified by our model that were not hand-mapped are plotted above.

Next steps will involve transforming the predicted dune field candidates back into geographic coordinates for visual inspection in GIS software. This will involve stitching image tiles together, or creating polygons from the mapped dune field candidates, followed by quality control.
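The stitching step amounts to placing each tile back at its row/column position in the mosaic grid; a minimal sketch (the `stitch_tiles` helper is hypothetical and assumes row-major tile order, with georeferencing still to be re-attached afterwards):

```python
import numpy as np

def stitch_tiles(tiles, n_rows, n_cols):
    """Reassemble a row-major list of equally sized 2D tiles into one
    mosaic, the first step toward mapping predictions back to Mars
    coordinates."""
    tile_h, tile_w = tiles[0].shape
    mosaic = np.zeros((n_rows * tile_h, n_cols * tile_w), dtype=tiles[0].dtype)
    for k, t in enumerate(tiles):
        r, c = divmod(k, n_cols)  # grid position of the k-th tile
        mosaic[r * tile_h:(r + 1) * tile_h, c * tile_w:(c + 1) * tile_w] = t
    return mosaic
```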

Summary

We have developed a deep learning workflow that utilizes the UNet convolutional neural network architecture to test the feasibility of automated detection of dune fields on Mars from satellite images. Our final model predicts the location, geometry, and extent of dune fields with a validation accuracy of 0.97. We then applied the model to the full dataset and identified areas that may correspond to additional unmapped dune fields on Mars. This project provides a proof of concept that CNNs may offer an automated way of locating and mapping the extent of dune fields on the Martian surface.